Moran's I – A Swedish Story

🧭 What is Moran's I?

Moran's I is the most widely used measure of spatial autocorrelation — it tells us whether similar values tend to cluster together in space, or whether they are randomly or negatively arranged.

$$I = \dfrac{N}{W} \cdot \dfrac{\displaystyle\sum_{i}\sum_{j} w_{ij}\,(x_i - \bar{x})(x_j - \bar{x})}{\displaystyle\sum_{i}(x_i - \bar{x})^2}$$

NNumber of spatial units (regions)

WSum of all spatial weights wᵢⱼ

wSpatial weight matrix — encodes "who is neighbor to whom"

xVariable of interest (e.g., unemployment rate)

Interpretation
I ≈ +1 → Strong positive autocorrelation (clusters)
I ≈ 0 → Spatial randomness
I ≈ −1 → Dispersed / checkerboard pattern

Expected value under randomness: E[I] = −1/(N−1) ≈ 0 for large N

🇸🇪 The Data: Swedish NUTS-2 Regions

Sweden is divided into 8 NUTS-2 regions. We'll use simulated unemployment rates (%) that mirror real north–south patterns, where the more urbanized south tends to have lower unemployment.

Region	Code	Unemployment %

High (>7%)

Medium

Low (<5%)

Hover over regions to explore

🔗 Step 1: Build the Spatial Weights Matrix

The weights matrix W defines which regions are neighbors. The simplest approach is queen contiguity (shared border or corner). In PySAL, we use libpysal:

import geopandas as gpd
import libpysal
from esda.moran import Moran
import splot.esda as splot
import matplotlib.pyplot as plt

# Load Swedish NUTS-2 shapefile (or GeoPackage)
sweden = gpd.read_file("sweden_nuts2.gpkg")

# Create Queen contiguity weights
w = libpysal.weights.Queen.from_dataframe(sweden)
w.transform = 'r'   # row-standardize: each row sums to 1

# Inspect
print(w.n)           # 8 regions
print(w.mean_neighbors)  # average neighbors per region
    

Row-standardization (transform='r') means each weight wᵢⱼ = 1/nᵢ where nᵢ is the number of neighbors of i. This makes the spatial lag a proper local average of neighbors.

Weights Matrix (Queen Contiguity)

📐 Step 2: Compute Moran's I

# Compute Global Moran's I
mi = Moran(sweden['unemployment'], w)

print(f"Moran's I  = {mi.I:.4f}")
print(f"Expected I = {mi.EI:.4f}")
print(f"p-value    = {mi.p_sim:.4f}")   # based on 999 permutations
print(f"z-score    = {mi.z_sim:.4f}")
    

+0.612

Moran's I

0.012

p-value (999 perms)

Interpretation: I = 0.612 is substantially above the expected value (≈ 0). The p-value of 0.012 means there is only a 1.2% chance of observing this level of clustering by random chance. Unemployment rates in Sweden are significantly clustered in space. High-unemployment regions border other high-unemployment regions (mainly in the north), and low-unemployment regions cluster in the south.

📊 Step 3: The Moran Scatter Plot

The Moran scatter plot shows each region's standardized value (z) on the X axis vs. its spatial lag (average of neighbors' z-scores) on the Y axis. The slope of the regression line is Moran's I.

# Plot Moran scatter plot using splot
fig, ax = splot.esda.moran_scatterplot(mi, aspect_equal=True)
plt.title("Moran Scatter Plot – Swedish Unemployment")
plt.show()
    

HH (High-High)

LH

HL

LL (Low-Low)

Quadrant Meanings

Q1 – High-High (HH): Region has high unemployment AND its neighbors do too. Spatial cluster of disadvantage.

Q2 – Low-High (LH): Region is low but surrounded by high. Spatial outlier.

Q3 – Low-Low (LL): Region and its neighbors all have low unemployment. Cluster of prosperity.

Q4 – High-Low (HL): High region surrounded by low. Spatial outlier.

The slope of the red line = Moran's I ≈ 0.612

🔍 Step 4: Local Moran's I (LISA)

Global Moran's I gives one number for the whole country. Local Indicators of Spatial Association (LISA) decompose it to identify where clusters are.

from esda.moran import Moran_Local

# Compute Local Moran's I
lm = Moran_Local(sweden['unemployment'], w)

# lm.q tells us the quadrant: 1=HH, 2=LH, 3=LL, 4=HL
# lm.p_sim gives the pseudo p-value per region
sweden['lisa_q']   = lm.q
sweden['lisa_p']   = lm.p_sim
sweden['sig']      = lm.p_sim < 0.05

# Plot significance map
splot.esda.lisa_cluster(lm, sweden, p=0.05)
plt.show()
    

Result for Sweden: Northern regions (Norra Mellansverige, Övre Norrland) appear as HH clusters — significantly high unemployment surrounded by high unemployment. The Stockholm region appears as part of an LL cluster — low unemployment in a prosperous neighborhood.

🚀 Complete PySAL Workflow

import geopandas as gpd
import libpysal
from esda.moran import Moran, Moran_Local
import splot.esda as splot
import matplotlib.pyplot as plt
import numpy as np

# ── 1. Load data ─────────────────────────────────────────────
sweden = gpd.read_file("sweden_nuts2.gpkg")
y = sweden["unemployment"]

# ── 2. Weights ────────────────────────────────────────────────
w = libpysal.weights.Queen.from_dataframe(sweden)
w.transform = "r"

# ── 3. Global Moran's I ───────────────────────────────────────
mi = Moran(y, w)
print(f"I={mi.I:.3f}, p={mi.p_sim:.3f}")

fig, axs = plt.subplots(1, 2, figsize=(12, 5))

# Choropleth map
sweden.plot(column="unemployment", ax=axs[0],
             cmap="YlOrRd", legend=True)
axs[0].set_title("Unemployment Rate (%)")

# Moran scatter plot
splot.esda.moran_scatterplot(mi, ax=axs[1])
axs[1].set_title(f"Moran Scatter (I={mi.I:.3f})")
plt.tight_layout()
plt.show()

# ── 4. Local Moran's I (LISA) ────────────────────────────────
lm = Moran_Local(y, w, seed=42)
fig2, ax2 = plt.subplots(figsize=(6, 8))
splot.esda.lisa_cluster(lm, sweden, p=0.05, ax=ax2)
ax2.set_title("LISA Cluster Map – Sweden")
plt.show()
    

✅ Key Takeaways

1Spatial weights encode the neighborhood structure — always row-standardize for Moran's I

2Global Moran's I gives a single summary statistic for the whole study area

3Significance is assessed by permutation (999 random shuffles), not parametric tables

4Local Moran's I (LISA) identifies specific cluster locations and types (HH, LL, HL, LH)

Sweden's unemployment story: The north–south gradient in Swedish unemployment isn't random — it's spatially structured. Moran's I ≈ 0.61 with p < 0.05 tells us this pattern is unlikely to have arisen by chance.

Libraries used: libpysal, esda, splot, geopandas, matplotlib. Install with: pip install pysal splot geopandas